

 pure strategy






XDO: A Double Oracle Algorithm for Extensive-Form Games

Neural Information Processing Systems

Policy Space Response Oracles (PSRO) is a reinforcement learning (RL) algorithm for two-player zero-sum games that has been empirically shown to find approximate Nash equilibria in large games.




Perturbing Best Responses in Zero-Sum Games

Dziwoki, Adam, Horcik, Rostislav

arXiv.org Artificial Intelligence

This paper investigates the impact of perturbations on best-response-based algorithms that approximate Nash equilibria in zero-sum games, namely Double Oracle and Fictitious Play. More precisely, we assume that the oracle computing the best responses perturbs the utilities before selecting the best response. We show that using such an oracle reduces the number of iterations for both algorithms. In some cases, suitable perturbations ensure that the expected number of iterations is logarithmic. Although the utility perturbation is computationally demanding, as it requires iterating through all pure strategies, we demonstrate that one can efficiently perturb the utilities in games where pure strategies have further inner structure.
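To make the idea of a perturbed best-response oracle concrete, here is a minimal Python sketch of Fictitious Play on a small matrix game. It is an illustration under stated assumptions, not the paper's method: the abstract does not specify the perturbation distribution, so the uniform noise, the `scale` parameter, and the function names `perturbed_best_response`, `fictitious_play`, and `exploitability` are all hypothetical choices. Setting `scale=0.0` recovers plain Fictitious Play.

```python
import random


def perturbed_best_response(utilities, scale, rng):
    """Pick the pure strategy with the highest utility after adding noise.

    Assumption: noise is uniform on [0, scale]; the source does not say
    which distribution it uses.
    """
    noisy = [u + rng.uniform(0.0, scale) for u in utilities]
    return max(range(len(noisy)), key=noisy.__getitem__)


def fictitious_play(A, iterations=2000, scale=0.0, seed=0):
    """Simultaneous Fictitious Play on a zero-sum matrix game.

    A[i][j] is the row player's payoff; the column player receives -A[i][j].
    Returns the empirical mixed strategies (x, y) of both players.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    row_counts = [0] * m
    col_counts = [0] * n
    # Arbitrary initial pure strategies for both players.
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iterations):
        total_r = sum(row_counts)
        total_c = sum(col_counts)
        # Expected utility of each pure strategy against the opponent's
        # empirical mixture so far.
        row_utils = [
            sum(A[i][j] * col_counts[j] for j in range(n)) / total_c
            for i in range(m)
        ]
        col_utils = [
            -sum(A[i][j] * row_counts[i] for i in range(m)) / total_r
            for j in range(n)
        ]
        i = perturbed_best_response(row_utils, scale, rng)
        j = perturbed_best_response(col_utils, scale, rng)
        row_counts[i] += 1
        col_counts[j] += 1
    x = [c / sum(row_counts) for c in row_counts]
    y = [c / sum(col_counts) for c in col_counts]
    return x, y


def exploitability(A, x, y):
    """Sum of both players' gains from deviating to an exact best response."""
    m, n = len(A), len(A[0])
    best_row = max(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))
    best_col = min(sum(A[i][j] * x[i] for i in range(m)) for j in range(n))
    return best_row - best_col


# Rock-paper-scissors payoff matrix for the row player.
RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

x_plain, y_plain = fictitious_play(RPS, scale=0.0)
x_pert, y_pert = fictitious_play(RPS, scale=0.05, seed=1)
print("plain:", exploitability(RPS, x_plain, y_plain))
print("perturbed:", exploitability(RPS, x_pert, y_pert))
```

The sketch only shows where the perturbation enters the loop. Note that it evaluates every pure strategy on each call, which is the cost the abstract describes as computationally demanding; it makes no attempt to exploit inner structure of the strategy space, and it does not by itself demonstrate the iteration-count results claimed in the paper.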